Iterative Bayesian and MMSE-based noise compensation techniques for speaker recognition in the i-vector space
Abstract
Dealing with additive noise in the i-vector space is challenging because the effect of the noise in that space is complex. Several compensation techniques have been proposed in recent years, either to remove the noise effect by building a noise model in the i-vector space or to build better scoring techniques that take environmental perturbations into account. We recently presented an efficient Bayesian cleaning technique operating in the i-vector domain, named I-MAP, that improves the baseline system performance by up to 60%. This technique is based on Gaussian models for the clean and noise i-vector distributions. After the I-MAP transformation, these hypotheses are probably less accurate. For this reason, we propose to apply another, MMSE-based approach that uses the Kabsch algorithm. For a given noise, it estimates the best translation vector and rotation matrix between a set of noisy training i-vectors and their clean counterparts according to the root-mean-square deviation (RMSD) criterion. This transformation is then applied to noisy test i-vectors in order to remove the noise effect. We show that applying the Kabsch algorithm yields a 40% relative improvement in EER compared to the baseline system and that, when combined with I-MAP and repeated iteratively, it reaches an 85% relative improvement.
Keywords: i-vector, additive noise, Kabsch algorithm, I-MAP
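The Kabsch step described in the abstract (estimate a rotation matrix and a translation vector from paired noisy and clean training i-vectors, then apply them to noisy test i-vectors) can be sketched as follows. This is a minimal illustration of the standard SVD-based Kabsch solution in arbitrary dimension, not the authors' implementation: the function names `kabsch_fit` and `kabsch_apply` are hypothetical, the i-vectors are assumed to be stored as rows of NumPy arrays, and the per-noise estimation and the iteration with I-MAP are not reproduced here.

```python
import numpy as np


def kabsch_fit(noisy, clean):
    """Estimate R, t minimising the RMSD between R @ noisy_i + t and clean_i.

    noisy, clean: arrays of shape (n_vectors, dim) holding paired vectors.
    (Hypothetical helper; the pairing of rows is an assumption.)
    """
    mu_noisy = noisy.mean(axis=0)
    mu_clean = clean.mean(axis=0)

    # Cross-covariance of the centred point sets, shape (dim, dim).
    H = (noisy - mu_noisy).T @ (clean - mu_clean)
    U, _, Vt = np.linalg.svd(H)

    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    D = np.eye(H.shape[0])
    D[-1, -1] = d

    R = Vt.T @ D @ U.T
    t = mu_clean - R @ mu_noisy
    return R, t


def kabsch_apply(x, R, t):
    """Apply the estimated rotation and translation to row vectors x."""
    return x @ R.T + t
```

In use, `R, t = kabsch_fit(train_noisy, train_clean)` would be computed once per noise condition and `kabsch_apply(test_noisy, R, t)` applied to the test i-vectors; the variable names here are placeholders.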
Similar resources
Probabilistic Approach Using Joint Clean and Noisy i-Vectors Modeling for Speaker Recognition
Additive noise is one of the main challenges for automatic speaker recognition, and several compensation techniques have been proposed to deal with this problem. In this paper, we present a new "data-driven" denoising technique operating in the i-vector space, based on a joint modeling of clean and noisy i-vectors. The joint distribution is estimated using a large set of i-vector pairs (clean i-...
Robust Speaker Recognition Using MAP Estimation of Additive Noise in i-vectors Space
In the last few years, the use of i-vectors along with a generative back-end has become the new standard in speaker recognition. An i-vector is a compact representation of a speaker utterance extracted from a low dimensional total variability subspace. Although current speaker recognition systems achieve very good results in clean training and test conditions, the performance degrades considera...
Comparison of Spectral Methods for Spoken Language Identification
Automatic spoken language identification is the task of identifying a language from the speech signal. Language identification systems can be divided into two categories: spectral-based methods and phonetic-based methods. In the former, short-time characteristics of the speech spectrum are extracted as a multi-dimensional vector. The statistical model of these features is then obtained for each language. The ...
Robustness in ASR: An Experimental Study of the Interrelationship between Discriminant Feature-Space Transformation, Speaker Normalization and Environment Compensation
This thesis addresses the general problem of maintaining robust automatic speech recognition (ASR) performance under diverse speaker populations, channel conditions, and acoustic environments. To this end, the thesis analyzes the interactions between environment compensation techniques, frequency warping based speaker normalization, and discriminant feature-space transformation (DFT). These int...
I-vector based speaker recognition using advanced channel compensation techniques
This paper investigates advanced channel compensation techniques for the purpose of improving i-vector speaker verification performance in the presence of high intersession variability using the NIST 2008 and 2010 SRE corpora. The performance of four channel compensation techniques: (a) weighted maximum margin criterion (WMMC), (b) source-normalized WMMC (SN-WMMC), (c) weighted linear discrimin...